Fisher-Rao Metric, Geometry, and Complexity of Neural Networks
Authors
Abstract

We study the relationship between geometry and capacity measures for deep neural networks from an invariance viewpoint. We introduce a new notion of capacity — the Fisher-Rao norm — that possesses desirable invariance properties and is motivated by Information Geometry. We discover an analytical characterization of the new capacity measure, through which we establish norm-comparison inequalities and further show that the new measure serves as an umbrella for several existing norm-based complexity measures. We discuss upper bounds on the generalization error induced by the proposed measure. Extensive numerical experiments on CIFAR-10 support our theoretical findings. Our theoretical analysis rests on a key structural lemma about partial derivatives of multi-layer rectifier networks.
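The structural lemma mentioned above can be illustrated numerically. For a bias-free rectifier network, the output is positively homogeneous in the weights, so by Euler's theorem ⟨θ, ∇_θ f(x)⟩ = M·f(x), where M is the number of weight matrices. The sketch below (not the authors' code; a hypothetical two-layer ReLU network with M = 2, gradients taken by central finite differences) checks this identity:

```python
import numpy as np

# Hypothetical bias-free two-layer ReLU network f(x) = W2 @ relu(W1 @ x).
# Euler's identity for positively homogeneous functions predicts
# <theta, grad_theta f(x)> = M * f(x) with M = 2 weight matrices here.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))
W2 = rng.standard_normal((1, 4))
x = rng.standard_normal(3)

def f(theta):
    # Unpack the flat parameter vector into the two weight matrices.
    w1 = theta[:12].reshape(4, 3)
    w2 = theta[12:].reshape(1, 4)
    return float(w2 @ np.maximum(w1 @ x, 0.0))

theta = np.concatenate([W1.ravel(), W2.ravel()])

# Central finite differences for grad_theta f (adequate for a tiny net,
# since f is piecewise polynomial in theta away from ReLU kinks).
h = 1e-6
grad = np.array([
    (f(theta + h * e) - f(theta - h * e)) / (2 * h)
    for e in np.eye(theta.size)
])

lhs = float(theta @ grad)   # <theta, grad_theta f(x)>
rhs = 2 * f(theta)          # M * f(x), M = 2
print(abs(lhs - rhs))       # close to 0
```

The same identity is what ties the Fisher-Rao norm to the network output in the paper's analysis; scaling one layer up and an adjacent layer down leaves both f and the inner product unchanged, which is the invariance the norm is built on.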
Similar resources
Relative Fisher Information and Natural Gradient for Learning Large Modular Models
Fisher information and natural gradient have provided deep insights and powerful tools for artificial neural networks. However, related analysis becomes more and more difficult as the learner's structure grows large and complex. This paper makes a preliminary step towards a new direction. We extract a local component from a large neural system, and define its relative Fisher information metric that de...
GeoSeq2Seq: Information Geometric Sequence-to-Sequence Networks
The Fisher information metric is an important foundation of information geometry, wherein it allows us to approximate the local geometry of a probability distribution. Recurrent neural networks such as the Sequence-to-Sequence (Seq2Seq) networks that have lately been used to yield state-of-the-art performance on speech translation or image captioning have so far ignored the geometry of the late...
Uniqueness of the Fisher–Rao Metric on the Space of Smooth Densities
On a closed manifold of dimension greater than one, every smooth weak Riemannian metric on the space of smooth positive probability densities, that is invariant under the action of the diffeomorphism group, is a multiple of the Fisher–Rao metric. Introduction. The Fisher–Rao metric on the space Prob(M) of probability densities is of importance in the field of information geometry. Restricted to...
Information metric from a linear sigma model.
The idea that a space-time metric emerges as a Fisher-Rao "information metric" of instanton moduli space has been examined in several field theories, such as the Yang-Mills theories and nonlinear σ models. In this paper, we report that the flat Euclidean or Minkowskian metric, rather than an anti-de Sitter metric that generically emerges from instanton moduli spaces, can be obtained as the Fish...
Relative Natural Gradient for Learning Large Complex Models
Fisher information and natural gradient have provided deep insights and powerful tools for artificial neural networks. However, related analysis becomes more and more difficult as the learner's structure grows large and complex. This paper makes a preliminary step towards a new direction. We extract a local component of a large neuron system, and define its relative Fisher information metric that des...
Journal: CoRR
Volume: abs/1711.01530
Pages: -
Publication date: 2017